LMSim: Computing Domain-specific Semantic Word Similarities Using a Language Modeling Approach

Authors

  • Sachin Pawar
  • Swapnil Hingmire
  • Girish Keshav Palshikar
Abstract

We propose a method to compute domain-specific semantic similarity between words. Prior approaches to word similarity that rely on linguistic resources (such as WordNet) are not suitable, because a word may have a very specific and rare sense in a particular domain. For example, in the customer support domain, the word escalation is used in the sense of “problem raised by a customer”, and therefore in this domain the words escalation and complaint are semantically related. In our approach, domain-specific word similarity is captured through language modeling. We represent the context of a word as the set of word sequences in the domain corpus that contain the word. We define a similarity function that computes the weighted Jaccard similarity between the set representations of two words, and we propose a dynamic programming based approach to compute it efficiently. We demonstrate the effectiveness of our approach on domain-specific corpora from the Software Engineering and Agriculture domains.
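The idea of comparing two words through the word sequences that surround them can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the weighting scheme or the dynamic program, so the sketch assumes raw occurrence counts as weights and enumerates sequences by brute force. The names `context_sequences`, `weighted_jaccard`, `max_len`, and the `<w>` slot token are hypothetical and chosen only for illustration.

```python
from collections import Counter


def context_sequences(sentences, word, max_len=3, slot="<w>"):
    """Count word sequences of length 2..max_len that contain `word`.

    ASSUMPTION: the target word is replaced by a slot token so that the
    contexts of two different words can be compared directly. The paper's
    exact representation may differ.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for n in range(2, max_len + 1):
            for start in range(len(tokens) - n + 1):
                window = tokens[start:start + n]
                if word in window:
                    key = tuple(slot if t == word else t for t in window)
                    counts[key] += 1
    return counts


def weighted_jaccard(a, b):
    """Weighted Jaccard: sum of element-wise minima over sum of maxima.

    ASSUMPTION: weights are raw counts; the paper may weight sequences
    differently (for example by length or by language-model probability).
    """
    keys = set(a) | set(b)
    numerator = sum(min(a[k], b[k]) for k in keys)
    denominator = sum(max(a[k], b[k]) for k in keys)
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Toy corpus, invented for illustration only.
    corpus = [
        "the customer raised the escalation yesterday",
        "the customer raised the complaint yesterday",
        "the server restarted yesterday",
    ]
    esc = context_sequences(corpus, "escalation")
    comp = context_sequences(corpus, "complaint")
    srv = context_sequences(corpus, "server")
    print(weighted_jaccard(esc, comp))  # identical contexts in this toy corpus
    print(weighted_jaccard(esc, srv))   # only one shared context sequence
```

On this toy corpus, escalation and complaint occur in identical contexts and therefore score higher than escalation and server. Brute-force enumeration like this grows quickly with corpus size and sequence length, which is presumably the cost the paper's dynamic programming approach is meant to avoid.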

Similar resources

Supporting Variability with Late Semantic Adaptations of Domain-Specific Modeling Languages

Meta-object protocols are used to open up the implementations of object-oriented general-purpose languages to support semantic variability. They enable application-level semantic adaptations to the language, even at runtime. However, such meta-object protocols are not available for domain-specific modeling languages. Also, existing approaches to implementing domain-specific modeling l...

The evolution of the meaning of the word nurse based on the classical texts of Persian literature

Background and Aim: The semantic evolution of a word over time is inevitable and reflects a social, political, religious or cultural process. Nurse is one of the words that has a significant presence in Persian literary texts and has been used in many different meanings, such as slave, servant, maid, devotee, obedient, patient and preserver. The purpose of this study is to show its semantic ev...

Domain-Specific Semantic Class Disambiguation Using WordNet

This paper presents an approach which exploits general-purpose algorithms and resources for domain-specific semantic class disambiguation, thus facilitating the generalization of semantic patterns from word-based to class-based representations. Through the mapping of the domain-specific semantic hierarchy onto WordNet and the application of general-purpose word sense disambiguation and semanti...

An Executive Approach Based On the Production of Fuzzy Ontology Using the Semantic Web Rule Language Method (SWRL)

Today, the need to deal with ambiguous information in semantic web languages is increasing. Ontology is an important part of the W3C standards for the semantic web, used to define a conceptual standard vocabulary for the exchange of data between systems, the provision of reusable databases, and the facilitation of collaboration across multiple systems. However, classical ontology is not enough ...

Semantic Anchoring with Model Transformations

Model-Integrated Computing (MIC) is an approach to Model-Driven Architecture (MDA), which has been developed primarily for embedded systems. MIC places strong emphasis on the use of domain-specific modeling languages (DSML-s) and model transformations. A metamodeling process facilitated by the Generic Modeling Environment (GME) tool suite enables the rapid and inexpensive development of DSML-s....

Publication date: 2014